Homework 5
Libraries
library(vegabrite)
library(readr)
library(dplyr)
Exercise 1
- a. Graphic
- b. What marks are being used? What variables are mapped to which properties?
- The marks being used are points. Miles per gallon and weight are mapped to the position of each point, and each point represents a specific car.
- c. What is the main story of this graphic?
- I think this graphic tells a story about cars’ fuel efficiency in relation to their weight: the lower the weight, the better the fuel efficiency.
- d. What makes it a good graphic?
- For the most part, it’s an easy graphic to interpret. The labels are clear; however, there is room for improvement. For example, instead of static labels for the cars, the labels could be interactive, so that hovering over a point shows its label. Right now, all the labels appear at once, so many of them overlap and the car names are hard to read.
- e. What features do you think you would know how to implement in Vega-Lite?
- I think that I would be able to replicate the graphic.
- f. Are there any features of the graphic that you would not know how to do in Vega-Lite? If so, list them.
- I think I would have to investigate how to add the labels. I might be able to add them as a layer, but to implement the hover interaction I would need to research how to add that feature.
Exercise 2
- Create a graphic that shows the mean temperature for each month. How many “months” should you be displaying? (There is more than one answer to this – perhaps try doing it more than one way.)
Code
Rows: 2922 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): location, weather
dbl (8): precipitation, temp_max, temp_min, wind, year, month, day, day_of_...
date (1): date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 6 × 11
  location date       precipitation temp_max temp_min  wind weather  year month
  <chr>    <date>             <dbl>    <dbl>    <dbl> <dbl> <chr>   <dbl> <dbl>
1 Seattle  2012-01-01           0       12.8      5     4.7 drizzle  2012     1
2 Seattle  2012-01-02          10.9     10.6      2.8   4.5 rain     2012     1
3 Seattle  2012-01-03           0.8     11.7      7.2   2.3 rain     2012     1
4 Seattle  2012-01-04          20.3     12.2      5.6   4.7 rain     2012     1
5 Seattle  2012-01-05           1.3      8.9      2.8   6.1 rain     2012     1
6 Seattle  2012-01-06           2.5      4.4      2.2   2.2 rain     2012     1
# ℹ 2 more variables: day <dbl>, day_of_week <dbl>
num [1:2922] 1 1 1 1 1 1 1 1 1 1 ...
num [1:2922] 2012 2012 2012 2012 2012 ...
num [1:2922] 5 2.8 7.2 5.6 2.8 2.2 2.8 2.8 5 0.6 ...
num [1:2922] 12.8 10.6 11.7 12.2 8.9 4.4 7.2 10 9.4 6.1 ...
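The two readings of "month" in this exercise can be sketched with dplyr before any plotting. This is a minimal sketch, not the code that produced the output above: the `toy` tibble is a hypothetical stand-in for the real data, and taking the daily mean as the midpoint of `temp_min` and `temp_max` is an assumption. Pooling all years gives at most 12 months, while grouping by year and month gives one value per year-month pair.

```r
library(dplyr)

# Hypothetical stand-in for the weather data (the real file has one row per day).
toy <- tibble(
  year     = c(2012, 2012, 2012, 2013, 2013, 2013),
  month    = c(1, 1, 2, 1, 2, 2),
  temp_max = c(12.8, 10.6, 9.0, 8.0, 11.0, 13.0),
  temp_min = c(5.0, 2.8, 3.0, 1.0, 4.0, 6.0)
)

# Assumption: approximate each day's mean temperature as the min/max midpoint.
toy <- toy |> mutate(temp_mean = (temp_max + temp_min) / 2)

# Reading 1: pool all years together, one row per calendar month.
by_month <- toy |>
  group_by(month) |>
  summarise(mean_temp = mean(temp_mean), .groups = "drop")

# Reading 2: keep years separate, one row per year-month pair.
by_year_month <- toy |>
  group_by(year, month) |>
  summarise(mean_temp = mean(temp_mean), .groups = "drop")
```

Either summary table could then be passed to the chart in place of the raw data. Because the real data also has a `location` column, it may be necessary to filter or group by location as well.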
- Create a graphic that shows how the different types of weather (rain, fog, etc.) are distributed by month in Seattle. When is it rainiest in Seattle? Sunniest?
# Grouping by weather
weather_data <- weather_data |> group_by(weather)
# Creating the graphics
vl_chart(data = weather_data, height = 400, width = 600) |>
  vl_config(title = list(text = "Weather Patterns by Month")) |>
  vl_mark_bar() |>
  vl_encode_x("month", title = "Month", type = "nominal", sort = "ascending") |>
  vl_encode_y(aggregate = "count", title = "Count", type = "quantitative") |>
  vl_encode_color("weather", type = "nominal", title = "Weather Type") |>
  vl_encode_xOffset("weather", type = "nominal", title = "Weather Type")
- Comments: I tried to give each weather type a matching color, but for some reason I could not change the color scale, and therefore the domain of the colors. The plot appeared, but the actual data did not. I tried to debug it but was unsuccessful.
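The rainiest and sunniest months asked about in this exercise can also be read from a table of counts rather than from the bars. The following is a minimal sketch under stated assumptions: the `toy` tibble and the "Elsewhere" location are hypothetical, and only the column names `location`, `month`, and `weather` come from the data shown earlier.

```r
library(dplyr)

# Hypothetical toy rows; the real data has one row per day.
toy <- tibble(
  location = c(rep("Seattle", 7), "Elsewhere"),
  month    = c(1, 1, 1, 7, 7, 7, 7, 1),
  weather  = c("rain", "rain", "sun", "sun", "sun", "sun", "rain", "sun")
)

# Keep Seattle only, then count days of each weather type within each month.
counts <- toy |>
  filter(location == "Seattle") |>
  count(month, weather, name = "days")

# For each weather type, keep the month with the most days of that type.
peak_months <- counts |>
  group_by(weather) |>
  slice_max(days, n = 1, with_ties = FALSE) |>
  ungroup()
```

In `peak_months`, the row for "rain" gives the rainiest month and the row for "sun" gives the sunniest. On the real data the answer depends on the actual counts, which are not reproduced here.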